Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 100116 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 12.2 MiB |
| Average record size in memory | 128.0 B |
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 7 |
Warnings
| id is highly correlated with customer_id | High correlation |
|---|---|
| customer_id is highly correlated with id | High correlation |
| is_cardiologist is highly correlated with is_gp | High correlation |
| is_gp is highly correlated with is_cardiologist | High correlation |
| gender_female is highly correlated with gender_male | High correlation |
| gender_male is highly correlated with gender_female | High correlation |
| office_or_hospital_based_Hospital is highly correlated with office_or_hospital_based_Office | High correlation |
| office_or_hospital_based_Office is highly correlated with office_or_hospital_based_Hospital | High correlation |
| is_gp is highly correlated with rep_id and 3 other fields | High correlation |
| rep_id is highly correlated with is_gp and 1 other field | High correlation |
| office_or_hospital_based_Office is highly correlated with is_gp and 2 other fields | High correlation |
| office_or_hospital_based_Hospital is highly correlated with is_gp and 2 other fields | High correlation |
| is_cardiologist is highly correlated with is_gp and 3 other fields | High correlation |
| id is uniformly distributed | Uniform |
| customer_id is uniformly distributed | Uniform |
| email_open_total has 36890 (36.8%) zeros | Zeros |
| f2f_total has 33974 (33.9%) zeros | Zeros |
| prescription_total has 86554 (86.5%) zeros | Zeros |
| webinar_total has 93792 (93.7%) zeros | Zeros |
Reproduction
| Analysis started | 2021-05-28 21:04:16.265159 |
|---|---|
| Analysis finished | 2021-05-28 21:04:34.087162 |
| Duration | 17.82 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
id
Real number (ℝ≥0)
| Distinct | 8343 |
|---|---|
| Distinct (%) | 8.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4172.636821 |
| Minimum | 0 |
| Maximum | 8348 |
| Zeros | 12 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 417 |
| Q1 | 2085 |
| median | 4172 |
| Q3 | 6261 |
| 95-th percentile | 7929 |
| Maximum | 8348 |
| Range | 8348 |
| Interquartile range (IQR) | 4176 |
Descriptive statistics
| Standard deviation | 2410.077866 |
|---|---|
| Coefficient of variation (CV) | 0.5775910938 |
| Kurtosis | -1.200370264 |
| Mean | 4172.636821 |
| Median Absolute Deviation (MAD) | 2088 |
| Skewness | 0.0005144702163 |
| Sum | 417747708 |
| Variance | 5808475.318 |
| Monotonicity | Increasing |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 12 | < 0.1% |
| 5480 | 12 | < 0.1% |
| 1322 | 12 | < 0.1% |
| 7465 | 12 | < 0.1% |
| 5416 | 12 | < 0.1% |
| 3307 | 12 | < 0.1% |
| 1258 | 12 | < 0.1% |
| 7401 | 12 | < 0.1% |
| 5352 | 12 | < 0.1% |
| 3243 | 12 | < 0.1% |
| Other values (8333) | 99996 | 99.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 12 | < 0.1% |
| 1 | 12 | < 0.1% |
| 2 | 12 | < 0.1% |
| 3 | 12 | < 0.1% |
| 4 | 12 | < 0.1% |
| 5 | 12 | < 0.1% |
| 6 | 12 | < 0.1% |
| 7 | 12 | < 0.1% |
| 8 | 12 | < 0.1% |
| 9 | 12 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 8348 | 12 | < 0.1% |
| 8347 | 12 | < 0.1% |
| 8346 | 12 | < 0.1% |
| 8345 | 12 | < 0.1% |
| 8344 | 12 | < 0.1% |
| 8343 | 12 | < 0.1% |
| 8342 | 12 | < 0.1% |
| 8341 | 12 | < 0.1% |
| 8340 | 12 | < 0.1% |
| 8339 | 12 | < 0.1% |
customer_id
Real number (ℝ≥0)
HIGH CORRELATION, UNIFORM
| Distinct | 8343 |
|---|---|
| Distinct (%) | 8.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4272.636821 |
| Minimum | 100 |
| Maximum | 8448 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 100 |
|---|---|
| 5-th percentile | 517 |
| Q1 | 2185 |
| median | 4272 |
| Q3 | 6361 |
| 95-th percentile | 8029 |
| Maximum | 8448 |
| Range | 8348 |
| Interquartile range (IQR) | 4176 |
Descriptive statistics
| Standard deviation | 2410.077866 |
|---|---|
| Coefficient of variation (CV) | 0.5640727182 |
| Kurtosis | -1.200370264 |
| Mean | 4272.636821 |
| Median Absolute Deviation (MAD) | 2088 |
| Skewness | 0.0005144702163 |
| Sum | 427759308 |
| Variance | 5808475.318 |
| Monotonicity | Increasing |

| Value | Count | Frequency (%) |
|---|---|---|
| 2049 | 12 | < 0.1% |
| 3755 | 12 | < 0.1% |
| 7849 | 12 | < 0.1% |
| 5800 | 12 | < 0.1% |
| 3691 | 12 | < 0.1% |
| 1642 | 12 | < 0.1% |
| 7785 | 12 | < 0.1% |
| 5736 | 12 | < 0.1% |
| 3627 | 12 | < 0.1% |
| 1578 | 12 | < 0.1% |
| Other values (8333) | 99996 | 99.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 12 | < 0.1% |
| 101 | 12 | < 0.1% |
| 102 | 12 | < 0.1% |
| 103 | 12 | < 0.1% |
| 104 | 12 | < 0.1% |
| 105 | 12 | < 0.1% |
| 106 | 12 | < 0.1% |
| 107 | 12 | < 0.1% |
| 108 | 12 | < 0.1% |
| 109 | 12 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 8448 | 12 | < 0.1% |
| 8447 | 12 | < 0.1% |
| 8446 | 12 | < 0.1% |
| 8445 | 12 | < 0.1% |
| 8444 | 12 | < 0.1% |
| 8443 | 12 | < 0.1% |
| 8442 | 12 | < 0.1% |
| 8441 | 12 | < 0.1% |
| 8440 | 12 | < 0.1% |
| 8439 | 12 | < 0.1% |
rep_id
Real number (ℝ≥0)
| Distinct | 153 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 154.2720844 |
| Minimum | 100 |
| Maximum | 252 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 100 |
|---|---|
| 5-th percentile | 102 |
| Q1 | 124 |
| median | 149 |
| Q3 | 185 |
| 95-th percentile | 218 |
| Maximum | 252 |
| Range | 152 |
| Interquartile range (IQR) | 61 |
Descriptive statistics
| Standard deviation | 36.60864628 |
|---|---|
| Coefficient of variation (CV) | 0.2372992264 |
| Kurtosis | -0.8447485583 |
| Mean | 154.2720844 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 0.3773270901 |
| Sum | 15445104 |
| Variance | 1340.192982 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 102 | 4320 | 4.3% |
| 145 | 2412 | 2.4% |
| 191 | 2100 | 2.1% |
| 147 | 2100 | 2.1% |
| 125 | 2064 | 2.1% |
| 129 | 1716 | 1.7% |
| 151 | 1620 | 1.6% |
| 124 | 1584 | 1.6% |
| 105 | 1512 | 1.5% |
| 153 | 1512 | 1.5% |
| Other values (143) | 79176 | 79.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 1092 | 1.1% |
| 101 | 948 | 0.9% |
| 102 | 4320 | 4.3% |
| 103 | 396 | 0.4% |
| 104 | 372 | 0.4% |
| 105 | 1512 | 1.5% |
| 106 | 1476 | 1.5% |
| 107 | 1428 | 1.4% |
| 108 | 660 | 0.7% |
| 109 | 348 | 0.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 252 | 12 | < 0.1% |
| 251 | 12 | < 0.1% |
| 250 | 12 | < 0.1% |
| 249 | 12 | < 0.1% |
| 248 | 12 | < 0.1% |
| 247 | 36 | < 0.1% |
| 246 | 12 | < 0.1% |
| 245 | 12 | < 0.1% |
| 244 | 168 | 0.2% |
| 243 | 12 | < 0.1% |
is_cardiologist
Categorical
HIGH CORRELATION
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 0 | 88080 |
| 1 | 12036 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88080 | 88.0% |
| 1 | 12036 | 12.0% |
is_gp
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 1 | 88080 |
| 0 | 12036 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88080 | 88.0% |
| 0 | 12036 | 12.0% |
years_since_graduation
Real number (ℝ≥0)
| Distinct | 60 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.08845739 |
| Minimum | 3 |
| Maximum | 68 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 17 |
| median | 28 |
| Q3 | 37 |
| 95-th percentile | 46 |
| Maximum | 68 |
| Range | 65 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 12.36977458 |
|---|---|
| Coefficient of variation (CV) | 0.4566437433 |
| Kurtosis | -0.9051889658 |
| Mean | 27.08845739 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.007919249722 |
| Sum | 2711988 |
| Variance | 153.0113232 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 32 | 2964 | 3.0% |
| 31 | 2952 | 2.9% |
| 33 | 2940 | 2.9% |
| 35 | 2928 | 2.9% |
| 36 | 2856 | 2.9% |
| 30 | 2832 | 2.8% |
| 38 | 2736 | 2.7% |
| 26 | 2652 | 2.6% |
| 39 | 2640 | 2.6% |
| 11 | 2616 | 2.6% |
| Other values (50) | 72000 | 71.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 12 | < 0.1% |
| 4 | 504 | 0.5% |
| 5 | 1032 | 1.0% |
| 6 | 1608 | 1.6% |
| 7 | 1812 | 1.8% |
| 8 | 2388 | 2.4% |
| 9 | 2424 | 2.4% |
| 10 | 2220 | 2.2% |
| 11 | 2616 | 2.6% |
| 12 | 2124 | 2.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 68 | 12 | < 0.1% |
| 63 | 24 | < 0.1% |
| 60 | 24 | < 0.1% |
| 59 | 60 | 0.1% |
| 58 | 60 | 0.1% |
| 57 | 48 | < 0.1% |
| 56 | 120 | 0.1% |
| 55 | 192 | 0.2% |
| 54 | 240 | 0.2% |
| 53 | 288 | 0.3% |
time_window_id
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.5 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3.75 |
| median | 6.5 |
| Q3 | 9.25 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 5.5 |
Descriptive statistics
| Standard deviation | 3.45206977 |
|---|---|
| Coefficient of variation (CV) | 0.5310876569 |
| Kurtosis | -1.216784055 |
| Mean | 6.5 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0 |
| Sum | 650754 |
| Variance | 11.9167857 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 8343 | 8.3% |
| 2 | 8343 | 8.3% |
| 3 | 8343 | 8.3% |
| 4 | 8343 | 8.3% |
| 5 | 8343 | 8.3% |
| 6 | 8343 | 8.3% |
| 7 | 8343 | 8.3% |
| 8 | 8343 | 8.3% |
| 9 | 8343 | 8.3% |
| 10 | 8343 | 8.3% |
| Other values (2) | 16686 | 16.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 8343 | 8.3% |
| 2 | 8343 | 8.3% |
| 3 | 8343 | 8.3% |
| 4 | 8343 | 8.3% |
| 5 | 8343 | 8.3% |
| 6 | 8343 | 8.3% |
| 7 | 8343 | 8.3% |
| 8 | 8343 | 8.3% |
| 9 | 8343 | 8.3% |
| 10 | 8343 | 8.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 8343 | 8.3% |
| 11 | 8343 | 8.3% |
| 10 | 8343 | 8.3% |
| 9 | 8343 | 8.3% |
| 8 | 8343 | 8.3% |
| 7 | 8343 | 8.3% |
| 6 | 8343 | 8.3% |
| 5 | 8343 | 8.3% |
| 4 | 8343 | 8.3% |
| 3 | 8343 | 8.3% |
conference_total
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 0 | 90525 |
| 1 | 9152 |
| 2 | 428 |
| 3 | 11 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 90525 | 90.4% |
| 1 | 9152 | 9.1% |
| 2 | 428 | 0.4% |
| 3 | 11 | < 0.1% |
email_open_total
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.995505214 |
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 36890 |
| Zeros (%) | 36.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 0.9942147861 |
|---|---|
| Coefficient of variation (CV) | 0.9987037458 |
| Kurtosis | 0.913514897 |
| Mean | 0.995505214 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9843159179 |
| Sum | 99666 |
| Variance | 0.988463041 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 36929 | 36.9% |
| 0 | 36890 | 36.8% |
| 2 | 18391 | 18.4% |
| 3 | 6046 | 6.0% |
| 4 | 1533 | 1.5% |
| 5 | 286 | 0.3% |
| 6 | 32 | < 0.1% |
| 7 | 9 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 36890 | 36.8% |
| 1 | 36929 | 36.9% |
| 2 | 18391 | 18.4% |
| 3 | 6046 | 6.0% |
| 4 | 1533 | 1.5% |
| 5 | 286 | 0.3% |
| 6 | 32 | < 0.1% |
| 7 | 9 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 9 | < 0.1% |
| 6 | 32 | < 0.1% |
| 5 | 286 | 0.3% |
| 4 | 1533 | 1.5% |
| 3 | 6046 | 6.0% |
| 2 | 18391 | 18.4% |
| 1 | 36929 | 36.9% |
| 0 | 36890 | 36.8% |
f2f_total
Real number (ℝ≥0)
| Distinct | 23 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.90197371 |
| Minimum | 0 |
| Maximum | 22 |
| Zeros | 33974 |
| Zeros (%) | 33.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 22 |
| Range | 22 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.155378666 |
|---|---|
| Coefficient of variation (CV) | 1.133232628 |
| Kurtosis | 5.601702041 |
| Mean | 1.90197371 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.808003049 |
| Sum | 190418 |
| Variance | 4.645657194 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 33974 | 33.9% |
| 1 | 18825 | 18.8% |
| 2 | 16083 | 16.1% |
| 3 | 12435 | 12.4% |
| 4 | 7979 | 8.0% |
| 5 | 4653 | 4.6% |
| 6 | 2516 | 2.5% |
| 7 | 1449 | 1.4% |
| 8 | 850 | 0.8% |
| 9 | 480 | 0.5% |
| Other values (13) | 872 | 0.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 33974 | 33.9% |
| 1 | 18825 | 18.8% |
| 2 | 16083 | 16.1% |
| 3 | 12435 | 12.4% |
| 4 | 7979 | 8.0% |
| 5 | 4653 | 4.6% |
| 6 | 2516 | 2.5% |
| 7 | 1449 | 1.4% |
| 8 | 850 | 0.8% |
| 9 | 480 | 0.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 22 | 2 | < 0.1% |
| 21 | 4 | < 0.1% |
| 20 | 5 | < 0.1% |
| 19 | 6 | < 0.1% |
| 18 | 18 | < 0.1% |
| 17 | 26 | < 0.1% |
| 16 | 29 | < 0.1% |
| 15 | 43 | < 0.1% |
| 14 | 69 | 0.1% |
| 13 | 95 | 0.1% |
prescription_total
Real number (ℝ≥0)
| Distinct | 32 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3535898358 |
| Minimum | 0 |
| Maximum | 42 |
| Zeros | 86554 |
| Zeros (%) | 86.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 42 |
| Range | 42 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.291228808 |
|---|---|
| Coefficient of variation (CV) | 3.651770152 |
| Kurtosis | 112.2690004 |
| Mean | 0.3535898358 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.950297873 |
| Sum | 35400 |
| Variance | 1.667271836 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 86554 | 86.5% |
| 1 | 4981 | 5.0% |
| 2 | 4108 | 4.1% |
| 3 | 1786 | 1.8% |
| 4 | 1031 | 1.0% |
| 5 | 565 | 0.6% |
| 6 | 339 | 0.3% |
| 7 | 199 | 0.2% |
| 8 | 151 | 0.2% |
| 9 | 99 | 0.1% |
| Other values (22) | 303 | 0.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 86554 | 86.5% |
| 1 | 4981 | 5.0% |
| 2 | 4108 | 4.1% |
| 3 | 1786 | 1.8% |
| 4 | 1031 | 1.0% |
| 5 | 565 | 0.6% |
| 6 | 339 | 0.3% |
| 7 | 199 | 0.2% |
| 8 | 151 | 0.2% |
| 9 | 99 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 42 | 1 | < 0.1% |
| 40 | 1 | < 0.1% |
| 35 | 1 | < 0.1% |
| 34 | 2 | < 0.1% |
| 30 | 3 | < 0.1% |
| 29 | 5 | < 0.1% |
| 26 | 2 | < 0.1% |
| 25 | 2 | < 0.1% |
| 23 | 10 | < 0.1% |
| 22 | 3 | < 0.1% |
webinar_total
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.08218466579 |
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 93792 |
| Zeros (%) | 93.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 782.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3676987394 |
|---|---|
| Coefficient of variation (CV) | 4.474055299 |
| Kurtosis | 69.75802141 |
| Mean | 0.08218466579 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.790673379 |
| Sum | 8228 |
| Variance | 0.135202363 |
| Monotonicity | Not monotonic |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 93792 | 93.7% |
| 1 | 5061 | 5.1% |
| 2 | 876 | 0.9% |
| 3 | 251 | 0.3% |
| 4 | 67 | 0.1% |
| 5 | 41 | < 0.1% |
| 7 | 14 | < 0.1% |
| 6 | 11 | < 0.1% |
| 8 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 93792 | 93.7% |
| 1 | 5061 | 5.1% |
| 2 | 876 | 0.9% |
| 3 | 251 | 0.3% |
| 4 | 67 | 0.1% |
| 5 | 41 | < 0.1% |
| 6 | 11 | < 0.1% |
| 7 | 14 | < 0.1% |
| 8 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 7 | 14 | < 0.1% |
| 6 | 11 | < 0.1% |
| 5 | 41 | < 0.1% |
| 4 | 67 | 0.1% |
| 3 | 251 | 0.3% |
| 2 | 876 | 0.9% |
| 1 | 5061 | 5.1% |
| 0 | 93792 | 93.7% |
gender_female
Categorical
HIGH CORRELATION
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 1 | 65616 |
| 0 | 34500 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
gender_male
Categorical
HIGH CORRELATION
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 1 | 65616 |
| 0 | 34500 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 65616 | 65.5% |
| 0 | 34500 | 34.5% |
office_or_hospital_based_Hospital
Categorical
HIGH CORRELATION
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 0 | 88980 |
| 1 | 11136 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 88980 | 88.9% |
| 1 | 11136 | 11.1% |
office_or_hospital_based_Office
Categorical
HIGH CORRELATION
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 782.3 KiB |

| Value | Count |
|---|---|
| 1 | 88980 |
| 0 | 11136 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 100116 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
[Figures: Length; Pie chart]
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100116 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 100116 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 100116 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 88980 | 88.9% |
| 0 | 11136 | 11.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
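The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch and not the code the report itself runs; the function name `pearson_r` and the toy lists are assumptions made for the example and are not taken from the profiled dataset.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r: cov(x, y) / (std(x) * std(y))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of both standard deviations
    # cancel, so sums of products and sums of squares are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical, for illustration only).
base = [0, 1, 2, 3, 4]
shifted = [v + 100 for v in base]      # change of location only
flag = [0, 1, 1, 0, 1]
complement = [1 - v for v in flag]     # a 0/1 flag and its complement

r_shifted = pearson_r(base, shifted)        # close to +1
r_complement = pearson_r(flag, complement)  # close to -1
print(r_shifted, r_complement)
```

A column that is another column plus a constant, or a 0/1 flag paired with its complement, yields |r| close to 1, which is the kind of relationship the high-correlation warnings in this report point at.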
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
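As a rough illustration of the rank-based calculation described above, the sketch below ranks both variables (ties receive the average of the positions they occupy, which is an assumption of this example) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and the data is a toy example rather than anything from the profiled dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)   # monotonic relationship, so rho is close to +1
r = pearson_r(x, y)        # the linear measure stays below 1
print(rho, r)
```

The gap between `rho` and `r` on this toy data shows why a monotonic but curved relationship can score higher under Spearman's ρ than under Pearson's r.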
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
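The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched as follows. The function name `kendall_tau_a` and the toy data are assumptions for illustration; statistical libraries often default to tie-adjusted variants (for example tau-b), so their results can differ from this sketch when the data contains ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair among four observations.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)   # 5 concordant, 1 discordant, 6 pairs in total
print(tau)
```

The double loop over pairs is quadratic in the number of observations, which is fine for a demonstration but slow for a table of the size profiled here.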
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | customer_id | rep_id | is_cardiologist | is_gp | years_since_graduation | time_window_id | conference_total | email_open_total | f2f_total | prescription_total | webinar_total | gender_female | gender_male | office_or_hospital_based_Hospital | office_or_hospital_based_Office |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 100 | 100 | 0 | 1 | 23 | 1 | 0 | 0 | 4 | 0 | 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 100 | 100 | 0 | 1 | 23 | 2 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 1 |
| 2 | 0 | 100 | 100 | 0 | 1 | 23 | 3 | 0 | 1 | 4 | 0 | 0 | 1 | 1 | 0 | 1 |
| 3 | 0 | 100 | 100 | 0 | 1 | 23 | 4 | 1 | 0 | 8 | 0 | 0 | 1 | 1 | 0 | 1 |
| 4 | 0 | 100 | 100 | 0 | 1 | 23 | 5 | 0 | 3 | 8 | 0 | 0 | 1 | 1 | 0 | 1 |
| 5 | 0 | 100 | 100 | 0 | 1 | 23 | 6 | 0 | 0 | 6 | 3 | 2 | 1 | 1 | 0 | 1 |
| 6 | 0 | 100 | 100 | 0 | 1 | 23 | 7 | 0 | 1 | 4 | 0 | 2 | 1 | 1 | 0 | 1 |
| 7 | 0 | 100 | 100 | 0 | 1 | 23 | 8 | 0 | 0 | 7 | 0 | 3 | 1 | 1 | 0 | 1 |
| 8 | 0 | 100 | 100 | 0 | 1 | 23 | 9 | 1 | 0 | 9 | 0 | 1 | 1 | 1 | 0 | 1 |
| 9 | 0 | 100 | 100 | 0 | 1 | 23 | 10 | 0 | 3 | 10 | 0 | 1 | 1 | 1 | 0 | 1 |
Last rows
| | id | customer_id | rep_id | is_cardiologist | is_gp | years_since_graduation | time_window_id | conference_total | email_open_total | f2f_total | prescription_total | webinar_total | gender_female | gender_male | office_or_hospital_based_Hospital | office_or_hospital_based_Office |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100106 | 8348 | 8448 | 115 | 0 | 1 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100107 | 8348 | 8448 | 115 | 0 | 1 | 3 | 4 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100108 | 8348 | 8448 | 115 | 0 | 1 | 3 | 5 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100109 | 8348 | 8448 | 115 | 0 | 1 | 3 | 6 | 0 | 3 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100110 | 8348 | 8448 | 115 | 0 | 1 | 3 | 7 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100111 | 8348 | 8448 | 115 | 0 | 1 | 3 | 8 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100112 | 8348 | 8448 | 115 | 0 | 1 | 3 | 9 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| 100113 | 8348 | 8448 | 115 | 0 | 1 | 3 | 10 | 0 | 0 | 3 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100114 | 8348 | 8448 | 115 | 0 | 1 | 3 | 11 | 0 | 1 | 4 | 0 | 0 | 1 | 1 | 0 | 1 |
| 100115 | 8348 | 8448 | 115 | 0 | 1 | 3 | 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |